Mini-batch optimization has proven to be a powerful paradigm for large-scale learning. However, the state-of-the-art parallel mini-batch algorithms assume synchronous operation or cyclic update orders. When worker nodes are heterogeneous (due to different computational capabilities or different communication delays), synchronous and cyclic operations are inefficient since they leave workers idle waiting for the slower nodes to complete their computations. In this paper, we propose an asynchronous mini-batch algorithm for regularized stochastic optimization problems with smooth loss functions that eliminates idle waiting and allows workers to run at their maximal update rates. We show that by suitably choosing the step-size values, the algorithm achieves a rate of the order $O(1/\sqrt{T})$ for general convex regularization functions, and the rate $O(1/T)$ for strongly convex regularization functions, where $T$ is the number of iterations. In both cases, the impact of asynchrony on the convergence rate of our algorithm is asymptotically negligible, and a near-linear speedup in the number of workers can be expected. Theoretical results are confirmed in real implementations on a distributed computing infrastructure.
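To make the setting concrete, below is a minimal serial sketch of a delayed mini-batch proximal gradient step for a smooth loss with an $\ell_1$ regularizer, where the gradient is evaluated at a stale iterate to mimic asynchronous workers. The least-squares loss, the bounded-delay model, the $1/\sqrt{t}$ step size, and all names and parameters are illustrative assumptions, not the authors' exact algorithm or implementation.

```python
import numpy as np

def soft_threshold(v, thresh):
    """Proximal operator of thresh * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

def delayed_minibatch_prox_sgd(A, b, lam=0.1, batch=32, T=2000, max_delay=4, seed=0):
    """Serial simulation of an asynchronous-style mini-batch proximal SGD on
    least squares + l1: each mini-batch gradient is computed at an iterate
    that may be up to `max_delay` updates old (a hypothetical delay model)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    history = [x.copy()]          # recent iterates, to draw stale copies from
    for t in range(1, T + 1):
        # Pick a stale iterate to mimic a slow worker pushing an old gradient.
        delay = rng.integers(0, min(max_delay + 1, len(history)))
        x_stale = history[-1 - delay]
        # Mini-batch stochastic gradient of the smooth loss at the stale point.
        idx = rng.integers(0, n, size=batch)
        g = A[idx].T @ (A[idx] @ x_stale - b[idx]) / batch
        # Diminishing step size ~ 1/sqrt(t), as used for general convex regularizers.
        eta = 1.0 / np.sqrt(t)
        x = soft_threshold(x - eta * g, eta * lam)
        history.append(x.copy())
        if len(history) > max_delay + 1:
            history.pop(0)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((500, 20))
    x_true = np.zeros(20); x_true[:5] = 1.0
    b = A @ x_true + 0.01 * rng.standard_normal(500)
    x_hat = delayed_minibatch_prox_sgd(A, b)
    print("recovered support:", np.nonzero(np.abs(x_hat) > 0.1)[0])
```

In this toy simulation the delay only perturbs where the gradient is evaluated, which is the mechanism whose effect the abstract claims becomes asymptotically negligible under suitable step sizes.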